Conversation
* Switch to codecov
* Solve flaky behave test -- 9.6 doesn't maintain replication slots on replicas
Clarify that a warning is changed to DEBUG only when the watchdog setting is _not_ set to required.
) `/sync` key wasn't updated after renaming the leader node with Patroni restart in pause (without Postgres restart). It prevented Patroni from promoting after the next restart without pause. Close patroni#3449
extra is called psycopg3, not psycopg
…roni#3457) Such a timeline increase may happen as a result of crash recovery in single-user mode + promote after taking a leader key while other replica nodes are isolated from DCS. In this case replica nodes didn't trigger the pg_rewind state machine because the leader, and therefore `primary_conninfo`, didn't change.
Existing link is invalid: https://patroni.readthedocs.io/en/latest/replication_modes.rst (from https://patroni.readthedocs.io/en/latest/ha_multi_dc.html ) Signed-off-by: George Melikov <[email protected]>
Removed `member-name` from the `edit-config` command. It would give the impression that it is possible to set up node-specific DCS configuration. This is a minor documentation issue I noticed when upgrading from a standalone cluster to Patroni.
I followed the switch to systemd notify introduced in patroni#3301, but receive the following warning:

```
systemd[1]: patroni.service: Got notification message from PID 2572251, but reception only permitted for main PID 2572234
```

With `Type=notify` systemd implies `NotifyAccess=main`, which restricts notify socket access to the main process only. Apparently Patroni tries to send from a different one. This PR lifts this restriction and allows all processes of the service to access the socket.

See https://www.freedesktop.org/software/systemd/man/latest/systemd.service.html#NotifyAccess=

> If `all`, all services updates from all members of the service's control group are accepted. This option should be set to open access to the notification socket when using `Type=notify`/`Type=notify-reload` or `WatchdogSec=` (see above). If those options are used but `NotifyAccess=` is not configured, it will be implicitly set to main.
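For readers unfamiliar with the notify protocol, the message that triggers this warning is just a datagram written to the Unix socket named in `NOTIFY_SOCKET`. The following is a minimal sketch of such a sender, not Patroni's implementation; the helper name `sd_notify` is an assumption made for illustration. With `NotifyAccess=main`, systemd would discard the datagram if it came from any PID other than the main one.

```python
import os
import socket


def sd_notify(state: str) -> bool:
    """Send a state string (e.g. "READY=1") to the systemd notify socket.

    Hypothetical helper for illustration only. Returns False when the
    process was not started with a notify socket.
    """
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    # A leading "@" denotes a Linux abstract-namespace socket.
    if path.startswith("@"):
        path = "\0" + path[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(state.encode(), path)
    return True
```

Calling `sd_notify("READY=1")` from a worker process rather than the main process is the situation the warning above describes.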
in addition to that did some maintenance:
- removed leftovers from python 2 (from \_\_future\_\_ import)
- improved psycopg tests
- removed usage of f-strings and str.format from logging calls
- switched from multiprocessing.pool.ThreadPool to concurrent.futures.ThreadPoolExecutor
- actions/checkout -> v5

---------

Co-authored-by: Jorge Solorzano <[email protected]>
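One common motivation for keeping f-strings and `str.format` out of logging calls is that the standard `logging` module defers formatting until a record is actually emitted. The sketch below illustrates the difference; the `Expensive` class and both function names are assumptions introduced for the example and are not taken from the PR.

```python
import logging

logger = logging.getLogger("example")


class Expensive:
    """Counts how often it is rendered to a string."""

    calls = 0

    def __str__(self) -> str:
        Expensive.calls += 1
        return "expensive"


def log_lazy(value: object) -> None:
    # Arguments are passed separately, so formatting is deferred until a
    # handler actually emits the record.
    logger.debug("value is %s", value)


def log_eager(value: object) -> None:
    # The f-string is rendered before logging decides whether to emit.
    logger.debug(f"value is {value}")
```

When the logger level is above DEBUG, `log_lazy` never calls `__str__`, while `log_eager` always does.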
`tag.failover_priority` values were ignored when `synchronous_node_count>1`. Besides that, document the limitation regarding `failover_priority` with `synchronous_mode=quorum`. Close patroni#3496 --------- Co-authored-by: Hugo DUBOIS <[email protected]>
- handle broken JSON responses
- improve reporting for etcd internal errors

Close patroni#3305, patroni#3473
ThreadPoolExecutor and as_completed() may return results in unpredictable order. We'd better not check node names.
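The ordering issue can be sketched as follows. This is an illustrative example only; `fetch` and `fetch_all` are hypothetical names, not functions from the test suite.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch(name: str) -> str:
    """Stand-in for a per-node request; simply echoes the name in upper case."""
    return name.upper()


def fetch_all(names: list) -> list:
    """Run fetch() for every name and collect results as they complete."""
    with ThreadPoolExecutor(max_workers=len(names)) as pool:
        futures = [pool.submit(fetch, n) for n in names]
        # as_completed() yields futures in completion order, which need
        # not match submission order.
        return [f.result() for f in as_completed(futures)]
```

A test that depends on such results is more robust when it compares `sorted(...)` output or sets rather than a positional list.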
Starting from PostgreSQL 10 we use `passfile` in `primary_conninfo`, and we failed to update the passfile after the replication password was updated in patroni.yaml with reload. Close patroni#3470
…atroni#3518) For PostgreSQL v12 and newer, `pg_settings` cannot be queried while the server is still starting and not yet accepting connections. As a workaround, we update `self._current_recovery_params` when writing a new `postgresql.conf` file. However, this logic did not account for the fact that `self._current_recovery_params` must contain all recovery parameters for correct comparison in `check_recovery_conf()`. To address this, the missing recovery parameters are now added to `self._current_recovery_params` in `write_recovery_conf()`, mirroring the behavior of `_read_recovery_params_pre_v12()`. Additionally, restore the `Postgresql.is_starting()` check in `Ha.is_healthiest_node()`, which was mistakenly removed in patroni#2726. Close patroni#3517
The link was to the YAML configuration, but `use_slots` and `slots` are dynamic configuration items.
Corrected the parameter name from Patroni `synchronous_mode` to PostgreSQL `synchronous_commit`.
We pinned them because of some incompatibilities which were later addressed.
dcs.cluster wasn't properly mocked
…oni#3537) Fixes patroni#3533 When `initdb` or `basebackup` options are provided as a dict (instead of a list), the `option_is_allowed()` validation was bypassed, allowing blocked options like `compress` to be used. Added the `option_is_allowed(key)` check to the dict branch in `process_user_options()`. --------- Co-authored-by: Muhammad Umair Ali <[email protected]> Co-authored-by: Alexander Kukushkin <[email protected]>
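The shape of the fix can be sketched like this. The two functions below are simplified stand-ins that reuse the names from the description; their signatures, the `blocked` parameter, and the output format are assumptions for illustration and do not reproduce Patroni's code.

```python
def option_is_allowed(name: str, blocked: set) -> bool:
    """Return True unless the option name is on the blocked list."""
    return name not in blocked


def process_user_options(options, blocked: set) -> list:
    """Turn user options (dict or list) into CLI flags, skipping blocked ones."""
    result = []
    if isinstance(options, dict):
        for key, value in options.items():
            # The check that was missing in the dict branch.
            if option_is_allowed(key, blocked):
                result.append("--{0}={1}".format(key, value))
    elif isinstance(options, list):
        for item in options:
            if isinstance(item, str):
                if option_is_allowed(item, blocked):
                    result.append("--" + item)
            elif isinstance(item, dict):
                for key, value in item.items():
                    if option_is_allowed(key, blocked):
                        result.append("--{0}={1}".format(key, value))
    return result
```

With the check in place, the dict form and the list form filter blocked options the same way.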
The `compress` option was completely blocked for basebackup, but since PostgreSQL 15, server-side compression is useful and works transparently with plain format.
- Removed `compress` from blocked options list
- Added validation to allow only `--compress=server*` values (e.g., server-zstd, server-gzip)
- Reject client-side compression with helpful error message
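A minimal sketch of the validation rule described above, assuming a hypothetical helper `validate_compress` and an error message invented for the example; the real check may differ in both name and wording.

```python
def validate_compress(value: str) -> None:
    """Accept only server-side compression values such as "server-zstd".

    Hypothetical helper for illustration; raises ValueError otherwise.
    """
    if not value.startswith("server"):
        raise ValueError(
            "client-side compression is not supported, use a server-side "
            "method such as 'server-gzip' (got %r)" % value
        )
```

Values like `server-zstd` and `server-gzip` pass, while a bare `gzip` or a numeric level is rejected.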
patroni#3556) Documentation update
The response JSON containing error code and message may vary depending on where it is raised from. Follow up on patroni#3338 and patroni#3486
Avoid starting/stopping threads at runtime:

1. Always start slot advance thread and CitusHandler thread on Patroni start.
2. Introduce thread pool for REST API and use it for running REST API itself and executing incoming HTTP requests.
3. Introduce a global thread pool to execute async tasks and for making REST API requests during leader race and failsafe checks.
4. Allow configuring global `thread_pool_size` and `restapi.thread_pool_size`.

Besides that, adjust system(d) start script to run Patroni with `MALLOC_ARENA_MAX=1` to reduce allocated virtual memory and add informational warning sections to README.rst and docs.

Close patroni#3474
Close patroni#3481
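The idea of a long-lived shared pool, as opposed to spawning a thread per task, can be sketched as below. This is an illustration only: `THREAD_POOL_SIZE`, `run_async`, and `worker_ids` are hypothetical names, and the constant merely stands in for a configurable `thread_pool_size`.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a configurable `thread_pool_size`.
THREAD_POOL_SIZE = 2

# Created once at start-up and reused for every task.
_pool = ThreadPoolExecutor(max_workers=THREAD_POOL_SIZE, thread_name_prefix="worker")


def run_async(func, *args):
    """Schedule func on the shared pool and return its Future."""
    return _pool.submit(func, *args)


def worker_ids(task_count: int) -> set:
    """Run task_count trivial tasks and collect the threads that ran them."""
    futures = [run_async(threading.get_ident) for _ in range(task_count)]
    return {f.result() for f in futures}
```

However many tasks are submitted, the set returned by `worker_ids` never contains more than `THREAD_POOL_SIZE` distinct thread identifiers, because the worker threads are reused.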
and discard new config if it is not. Ref patroni#3546
Ha.shutdown() needs it to run some health checks Ref patroni#3526
or at least reduce it to minimum
Before patroni#3526 the thread was started with the first attempt to sync metadata, where we had guarantees that the citus database is prepared (extension exists). The early start caused an enormous number of errors. We return to the old behavior by using `self._ready_to_run = Event()`
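The pattern of starting a thread early but holding it idle until a precondition is met can be sketched as follows. Only the `_ready_to_run` attribute name comes from the description; the `Worker` class and its other members are assumptions for illustration.

```python
import threading


class Worker(threading.Thread):
    """Background thread that stays idle until it is marked ready."""

    def __init__(self) -> None:
        super().__init__(daemon=True)
        self._ready_to_run = threading.Event()
        self._stop_requested = threading.Event()
        self.iterations = 0

    def mark_ready(self) -> None:
        """Signal that the precondition holds and work may begin."""
        self._ready_to_run.set()

    def stop(self) -> None:
        self._stop_requested.set()
        self._ready_to_run.set()  # unblock the thread so it can exit

    def run(self) -> None:
        # Block here until mark_ready() or stop() is called.
        self._ready_to_run.wait()
        while not self._stop_requested.is_set():
            self.iterations += 1
            self._stop_requested.wait(0.01)
```

The thread exists from start-up, yet it performs no work until `mark_ready()` is called.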
**Problem Description:** When the Kubernetes API returns a 403 Permission Denied error (e.g., due to temporary RBAC permission loss), the current code immediately logs an exception and returns False, which may cause the leader to be incorrectly demoted. However, in real-world scenarios, permission issues can be temporary (such as RBAC updates, network fluctuations), and the application should be given an opportunity to recover within a timeout period.

**Solution:** Add a dedicated `_handle_permission_denied` method that, when encountering a 403 error:

1. Continuously verifies the leader status within the `retry_timeout` period
2. Checks the leader object every 0.5 seconds to confirm whether the current instance is still the leader
3. Returns one of three states based on the verification result:
   - 'retry': Still the leader, continue retrying the update operation
   - 'demote': Leadership has been lost, demote immediately (return False)
   - 'timeout': Unable to confirm within the timeout period, handle as timeout (return False)

This PR fixes issue patroni#3536.

Co-authored-by: Sophia Ruan <[email protected]>
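The polling logic described above might look roughly like the sketch below. It is written as a free function with injected dependencies so it can be exercised in isolation; the function name, its parameters, and the convention that `get_leader` returns `None` when the leader object cannot be read are all assumptions, not the PR's actual code.

```python
import time
from typing import Callable, Optional


def handle_permission_denied(
    get_leader: Callable[[], Optional[str]],
    my_name: str,
    retry_timeout: float,
    interval: float = 0.5,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Return 'retry', 'demote' or 'timeout' after a 403 response.

    get_leader returns the current leader name, or None when the leader
    object cannot be read at the moment (assumed convention).
    """
    deadline = clock() + retry_timeout
    while clock() < deadline:
        leader = get_leader()
        if leader == my_name:
            return "retry"  # still the leader: retry the update
        if leader is not None:
            return "demote"  # someone else holds the leader key
        sleep(interval)  # could not confirm yet: check again later
    return "timeout"
```

Injecting `clock` and `sleep` keeps the loop testable without real delays.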
v3.6.9, v3.5.28, and v3.4.42 addressed some CVEs and now it is no longer possible to read cluster topology and perform lease keepalive requests without authentication. Close patroni#3573
time.sleep(0.001) is very unreliable and makes tests flaky
- Release notes
- Update version
- Pyright 1.1.408
This allows systemd-reload command to wait for the configuration to actually have been processed. Encapsulate the import logic in a separate function to be used by the already present notify implementation.
When systemd receives unexpected notifications, it may terminate the Patroni unit. Close patroni#3586